The paper that describes the Propensity to Cycle Tool (PCT) and some of the thinking behind it was released today (Lovelace et al. 2016). For those new to the PCT, it’s an online tool for helping to decide where to prioritise cycling policies such as new cycle paths.
The PCT has been 2 years in the making and there are many things to say about it, from the methods that underlie the estimates of cycling uptake to evidence-based policy. This post focusses on just one aspect of the PCT, however: its use of CycleStreets.net to convert straight travel desire lines into wiggly paths on the road network. These wiggly lines, such as those illustrated below (from Figure 9 in the paper), represent routes that someone cycling could plausibly take.
Before saying something about the routes themselves, it’s worth taking a step back to look at the data underlying the PCT. The PCT relies on origin-destination (OD) data. That is simply data in the following form:
| origin | destination | V1 | V2 |
|---|---|---|---|
| 1 | 2 | 100 | 3 |
| 1 | 3 | 50 | 5 |
What this example OD table means is that 100 units of ‘V1’ and 3 units of ‘V2’ travel between zone 1 and zone 2. The second row represents movement between zone 1 and zone 3: 50 units of ‘V1’ and 5 units of ‘V2’.
Now, imagine that V1 represents the total number of people travelling between the origin and destination and that V2 represents the number who regularly cycle. That is the basic input dataset for the PCT. We use 2011 OD data on travel to work, because that is the most comprehensive dataset on travel patterns available. (Note: we are using open data made available from the http://wicid.ukdataservice.ac.uk/ website).
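To make this concrete, here is a minimal sketch of the example table as an R data frame, using base R only. The column names all, bicycle and pcycle are illustrative assumptions chosen for this example; they are not names used by the PCT itself.

```r
# Toy OD table mirroring the example above (column names are illustrative)
od = data.frame(
  origin = c(1, 1),
  destination = c(2, 3),
  all = c(100, 50),   # V1: total number of people travelling
  bicycle = c(3, 5)   # V2: number who regularly cycle
)
# Percentage of people who cycle for each OD pair
od$pcycle = 100 * od$bicycle / od$all
od
```

Deriving a variable like this from the two counts is the kind of step that becomes useful later, once the rows have been given geography.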
One problem with OD data is that the rows do not tend to have geography inherently built in. They could contain variables called lat_origin, lon_origin, lat_destination and lon_destination, but generally they only contain the IDs of geographic zones. Therefore work is needed to convert the OD data into desire lines. Desire lines are straight lines representing where people would go if they were not constrained by the route network, as illustrated in Figure 3 of the paper:
To do this work, an R package called stplanr was developed. After R and RStudio have been installed on your computer, this software can be installed from CRAN with the following command:
install.packages("stplanr")
Once it is installed, the package can be loaded as follows (note that it depends on the sp package):
library(stplanr)
## Loading required package: sp
This gives you access to many functions for working with transport data, with a focus on geographical data. It also provides example data from which desire lines, and eventually routes generated by CycleStreets.net, can be generated. Let’s take a look at some real OD data provided by stplanr:
data("flow") # load the 'flow' dataset from the stplanr package
head(flow[c(1:3, 12)])
## Area.of.residence Area.of.workplace All Bicycle
## 920573 E02002361 E02002361 109 2
## 920575 E02002361 E02002363 38 0
## 920578 E02002361 E02002367 10 0
## 920582 E02002361 E02002371 44 3
## 920587 E02002361 E02002377 34 0
## 920591 E02002361 E02002382 7 0
This shows that, between zone E02002361 and E02002361 (i.e. intrazonal flow), there were 109 people travelling to work by all modes in the 2011 census, of whom 2 cycled. The equivalent numbers for the OD pair E02002361 to E02002371 were 44 and 3. But how to make this data geographical?
For that we need another dataset, also provided by stplanr as follows:
data("cents") # load the 'cents' dataset
head(cents)
## class : SpatialPointsDataFrame
## features : 6
## extent : -1.550806, -1.511861, 53.8041, 53.82887 (xmin, xmax, ymin, ymax)
## coord. ref. : +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## variables : 4
## names : geo_code, MSOA11NM, percent_fem, avslope
## min values : E02002361, Leeds 032, 0.408759, 2.284782
## max values : E02002393, Leeds 064, 0.591141, 5.091685
What’s interesting about this cents dataset is that it’s geographical, and can be plotted on the map without issue, as illustrated below:
library(tmap)
osm_tiles = read_osm(bb(cents, 1.4))
(map = qtm(osm_tiles) +
qtm(cents, symbols.size = 5) )
The stplanr R package enables linking the non-geographical flow data in the flow data frame with the geographical data plotted above and contained in the cents data object. Let’s take a single OD pair, E02002361 to E02002371, the fourth row represented in the table above, to see how this works:
flow_single_line = flow[4,] # select only the fourth row
desire_line_single = od2line(flow = flow_single_line, zones = cents)
Now we can plot this on the map as follows:
map +
qtm(desire_line_single, lines.lwd = 5)
Note that the R function od2line() is generic in the sense that it will work the same if you give it a single OD pair or a table of thousands. To create desire lines for all OD pairs stored in the dataset flow, we enter the following R command:
l = od2line(flow = flow, zones = cents)
This creates the geographic data object l, which can be visualised as follows:
map +
qtm(l)
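To give a feel for what happens conceptually when OD data is linked to zone centroids, here is a rough base R sketch. It is not how stplanr implements od2line(): the zone IDs and coordinates below are made up for illustration, and the result is a plain data frame of start and end coordinates instead of a spatial object.

```r
# Hypothetical zone centroids (made-up IDs and coordinates, not the 'cents' data)
zones = data.frame(
  geo_code = c("A", "B", "C"),
  lon = c(-1.55, -1.53, -1.51),
  lat = c(53.80, 53.81, 53.83),
  stringsAsFactors = FALSE
)
# Non-geographical OD data that refers to zones by ID only
od = data.frame(
  origin = c("A", "A"),
  destination = c("B", "C"),
  all = c(44, 7),
  stringsAsFactors = FALSE
)
# Find the row of 'zones' that matches each origin and destination ID
i_o = match(od$origin, zones$geo_code)
i_d = match(od$destination, zones$geo_code)
# Attach start and end coordinates: each row now defines one straight line
desire = cbind(
  od,
  lon_origin = zones$lon[i_o], lat_origin = zones$lat[i_o],
  lon_destination = zones$lon[i_d], lat_destination = zones$lat[i_d]
)
desire
```

The key step is the lookup by zone ID with match(), which is what allows a table with no coordinates of its own to inherit geography from the centroids.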
Now that the data is set up, thanks to the work done by stplanr, we can change the visual appearance of the desire lines with a single extra argument passed to the plotting function. Let’s make width depend on the total number of people travelling along the desire line:
map +
tm_shape(l) + tm_lines(lwd = "All", scale = 10)
Another fun thing we can do is to set the colour relative to the number of people cycling, as follows:
map +
tm_shape(l) + tm_lines(lwd = "All", scale = 10, col = "Bicycle")
OK, with all that fun out of the way (which is really the core of the data processing behind the PCT), we can now move on to the purpose of this article: to describe the routing functionality of CycleStreets.net.
Lovelace, Robin, Anna Goodman, Rachel Aldred, Nikolai Berkoff, Ali Abbas, and James Woodcock. 2016. “The Propensity to Cycle Tool: An Open Source Online System for Sustainable Transport Planning.” Journal of Transport and Land Use 10 (1). doi:10.5198/jtlu.2016.862.